{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[![Dataflowr](https://raw.githubusercontent.com/dataflowr/website/master/_assets/dataflowr_logo.png)](https://dataflowr.github.io/website/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "CUiFyiXKHovB"
   },
   "source": [
    "# [Module 6](https://dataflowr.github.io/website/modules/6-convolutional-neural-network/): Convolutions by examples\n",
    "\n",
    "We'll build our first Convolutional Neural Network (CNN) from scratch."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "7v7DN_UKHovE"
   },
   "source": [
    "## 1. Preparations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "eQmaHvupHovF"
   },
   "outputs": [],
   "source": [
    "%matplotlib inline\n",
    "import math,sys,os,numpy as np\n",
    "from numpy.linalg import norm\n",
    "from matplotlib import pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "lkRdumG1HovK"
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "import torchvision\n",
    "from torchvision import models,transforms,datasets"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "3JlQUj5IHovP"
   },
   "source": [
    "Download MNIST data on disk and convert it to pytorch compatible formating.\n",
    "\n",
    "```torchvision.datasets``` features support (download, formatting) for a collection of popular datasets. The list of available datasets in ```torchvision``` can be found [here](http://pytorch.org/docs/master/torchvision/datasets.html).\n",
    "\n",
    "Note that the download is performed only once. The function will always check first if the data is already on disk.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Bv9h8isvHovQ"
   },
   "outputs": [],
   "source": [
    "root_dir = './data/MNIST/'\n",
    "torchvision.datasets.MNIST(root=root_dir,download=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "XW1c2YmiHovW"
   },
   "source": [
    "MNIST datasets consists of small images of hand-written digits. The images are grayscale and have size 28 x 28. There are 60,000 training images and 10,000 testing images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "p9MA5fqrHovZ"
   },
   "outputs": [],
   "source": [
    "train_set = torchvision.datasets.MNIST(root=root_dir, train=True, download=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "yaLXqwxaHovg"
   },
   "source": [
    "Define and initialize a data loader for the MNIST data already downloaded on disk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "uEyTgg0THovj"
   },
   "outputs": [],
   "source": [
    "MNIST_dataset = torch.utils.data.DataLoader(train_set, batch_size=1, shuffle=True, num_workers=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "f12CxE8XHovo"
   },
   "source": [
    "For the current notebook, we can format data as _numpy ndarrays_ which are easier to plot in matplotlib. The same operations can be easily performed on _pytorch Tensors_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "_UnuFlpWHovq"
   },
   "outputs": [],
   "source": [
    "images = train_set.data.numpy().astype(np.float32)/255\n",
    "labels = train_set.targets.numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "RpkGWiaqHovz"
   },
   "outputs": [],
   "source": [
    "print(images.shape,labels.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "tPtQoS_-HowJ"
   },
   "source": [
    "## 2. Data visualization\n",
    "\n",
    "For convenience we define a few functions for formatting and plotting our image data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "F3lsBB7ZHowK"
   },
   "outputs": [],
   "source": [
    "# plot multiple images\n",
    "def plots(ims, interp=False, titles=None):\n",
    "    ims=np.array(ims)\n",
    "    mn,mx=ims.min(),ims.max()\n",
    "    f = plt.figure(figsize=(12,24))\n",
    "    for i in range(len(ims)):\n",
    "        sp=f.add_subplot(1, len(ims), i+1)\n",
    "        if not titles is None: sp.set_title(titles[i], fontsize=18)\n",
    "        plt.imshow(ims[i], interpolation=None if interp else 'none', vmin=mn,vmax=mx)\n",
    "\n",
    "# plot a single image\n",
    "def plot(im, interp=False):\n",
    "    f = plt.figure(figsize=(3,6), frameon=True)\n",
    "    plt.imshow(im, interpolation=None if interp else 'none')\n",
    "\n",
    "plt.gray()\n",
    "plt.close()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "vfxlah5eHowW"
   },
   "outputs": [],
   "source": [
    "plot(images[5000])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "VY5890I1Howd"
   },
   "outputs": [],
   "source": [
    "labels[5000]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "OEWipA5iHown"
   },
   "outputs": [],
   "source": [
    "plots(images[5000:5005], titles=labels[5000:5005])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "jyc9VagHHo0S"
   },
   "source": [
    "## 3. A simple classifier\n",
    "\n",
    "In this section we will construct a basic binary classifier.\n",
    "Our classifier will tell us whether a given image depicts a _one_ or an _eight_. \n",
    "\n",
    "We fetch all images from the _eight_ class and from the _one_ class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "ogJld-3qJDpW"
   },
   "outputs": [],
   "source": [
    "n=len(images)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Itlgssc9Ho0U"
   },
   "outputs": [],
   "source": [
    "eights=[images[i] for i in range(n) if labels[i]==8]\n",
    "ones=[images[i] for i in range(n) if labels[i]==1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "RGY2P44IHo0X"
   },
   "outputs": [],
   "source": [
    "len(eights), len(ones)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "CFmNNrKAHo0f"
   },
   "outputs": [],
   "source": [
    "plots(eights[:5])\n",
    "plots(ones[:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "3Mxs-x1pHo0l"
   },
   "source": [
    "We keep the first 1000 digits for the test set and we average all the remaining digits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "MEA8brOGHo0n"
   },
   "outputs": [],
   "source": [
    "raws8 =  np.mean(eights[1000:],axis=0)\n",
    "\n",
    "plot(raws8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "ipyVJA9OMUH-"
   },
   "source": [
    "We now do the same thing with the ones:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Aa3a1Le2Ho07"
   },
   "outputs": [],
   "source": [
    "raws1 =  np.mean(ones[1000:],axis=0)\n",
    "\n",
    "plot(raws1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "p6UJ-NicMUIA"
   },
   "source": [
    "We built a 'typical representative' of the eights and a 'typical representative' of the ones. Now for a new sample from the test set, we compute the distance between this sample and our two representatives and classify the sample with the label of the closest representative.\n",
    "\n",
    "For the distance between images, we just take the pixelwise squared distance."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "3ZML33QNHo1G"
   },
   "outputs": [],
   "source": [
    "# sum of squared distance\n",
    "def sse(a,b): return ((a-b)**2).sum()\n",
    "\n",
    "# return 1 if closest to 8 and 0 otherwise\n",
    "def is8_raw_n2(im): return 1 if sse(im,raws1) > sse(im,raws8) else 0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "IMqqBDrMMUID"
   },
   "outputs": [],
   "source": [
    "nb_8_predicted_8, nb_1_predicted_8 = [np.array([is8_raw_n2(im) for im in ims]).sum() for ims in [eights[:1000],ones[:1000]]]\n",
    "\n",
    "nb_8_predicted_1, nb_1_predicted_1 = [np.array([(1-is8_raw_n2(im)) for im in ims]).sum() for ims in [eights[:1000],ones[:1000]]]\n",
    "\n",
    "# just to check \n",
    "print(nb_8_predicted_1+nb_8_predicted_8, nb_1_predicted_1+nb_1_predicted_8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "CFUntUqdHo1e"
   },
   "source": [
    "<img src=\"https://upload.wikimedia.org/wikipedia/commons/thumb/2/26/Precisionrecall.svg/1024px-Precisionrecall.svg.png\" alt=\"Drawing\" style=\"width: 500px;\"/>\n",
    "\n",
    "source [wikipedia](https://en.wikipedia.org/wiki/Precision_and_recall)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "aaYe8malHo1f"
   },
   "outputs": [],
   "source": [
    "def compute_scores(nb_8_predicted_8,nb_8_predicted_1,nb_1_predicted_1,nb_1_predicted_8):\n",
    "    Precision_8 = nb_8_predicted_8/(nb_8_predicted_8+nb_1_predicted_8)\n",
    "    Recall_8 = nb_8_predicted_8/(nb_8_predicted_1+nb_8_predicted_8)\n",
    "    Precision_1 = nb_1_predicted_1/(nb_1_predicted_1+nb_8_predicted_1)\n",
    "    Recall_1 = nb_1_predicted_1/(nb_1_predicted_1+nb_1_predicted_8)\n",
    "    return Precision_8, Recall_8, Precision_1, Recall_1\n",
    "\n",
    "Precision_8, Recall_8, Precision_1, Recall_1 = compute_scores(nb_8_predicted_8,nb_8_predicted_1,nb_1_predicted_1,nb_1_predicted_8)\n",
    "\n",
    "print('precision 8:', Precision_8, 'recall 8:', Recall_8)\n",
    "print('precision 1:', Precision_1, 'recall 1:', Recall_1)\n",
    "print('accuracy :', (Recall_1+Recall_8)/2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "W7UtzI0uMUIH"
   },
   "source": [
    "This is our baseline for our binary classification task. Now your task will be to do better with convolutions!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "unXh3clNHoyM"
   },
   "source": [
    "## 4. Filters and convolutions\n",
    "\n",
    "Let start with this visual explanation of [Interactive image kernels](http://setosa.io/ev/image-kernels/)\n",
    "\n",
    "In some fields, convolution or filtering can be better understood as _correlations_. \n",
    "In practice we slide the filter matrix over the image (a bigger matrix) always selecting patches from the image with the same size as the filter. We compute the dot product between the filter and the image patch and store the scalar response which reflects the degree of similarity/correlation between the filter and image patch.\n",
    "\n",
    "Here is a simple 3x3 filter, ie a 3x3 matrix (see [Sobel operator](https://en.wikipedia.org/wiki/Sobel_operator) for more examples)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "OWiIsQ38HoyP"
   },
   "outputs": [],
   "source": [
    "top=[[-1,-1,-1],\n",
    "     [ 1, 1, 1],\n",
    "     [ 0, 0, 0]]\n",
    "\n",
    "plot(top)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "0dr63ciIMUIK"
   },
   "source": [
    "We now create a toy image, to understand how convolutions operate."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "jipouDFOHoyn"
   },
   "outputs": [],
   "source": [
    "cross = np.zeros((28,28))\n",
    "cross += np.eye(28)\n",
    "for i in range(4):\n",
    "    cross[12+i,:] = np.ones(28)\n",
    "    cross[:,12+i] = np.ones(28)\n",
    "\n",
    "plot(cross)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "g0j7I3OKHoyt"
   },
   "source": [
    "Our `top` filter should highlight top horizontal border in the image."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "zXc6nlfYHoyx"
   },
   "outputs": [],
   "source": [
    "from scipy.ndimage.filters import convolve, correlate\n",
    "\n",
    "corr_cross = correlate(cross,top)\n",
    "\n",
    "plot(corr_cross)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "5mi2j-hFHoy3"
   },
   "outputs": [],
   "source": [
    "?correlate"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "OdlQ_Kz8MUIS"
   },
   "source": [
    "What is done on the border of the image? \n",
    "\n",
    "## Padding\n",
    "\n",
    "![padding](https://dataflowr.github.io/notebooks/Module6/img/padding_conv.gif)\n",
    "\n",
    "source: [Convolution animations](https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "bPAEi-4gHoy-"
   },
   "outputs": [],
   "source": [
    "# to see the role of padding\n",
    "corr_cross = correlate(cross,top, mode='constant')\n",
    "plot(corr_cross)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "iSTy0Ka1HozC"
   },
   "outputs": [],
   "source": [
    "corrtop = correlate(images[5000], top)\n",
    "plot(corrtop)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "6kp8ArumHozJ"
   },
   "source": [
    "By rotating the filter with 90 degrees and calling the ```convolve``` function we get the same response as with the previously called ```correlate``` function."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "K2m8hh6XHozL"
   },
   "outputs": [],
   "source": [
    "np.rot90(top, 1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "MBratD5iHozV",
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "convtop = convolve(images[5000], np.rot90(top,2))\n",
    "plot(convtop)\n",
    "np.allclose(convtop, corrtop)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "XqFCxFXoHozd"
   },
   "source": [
    "Let's generate a few more variants of our simple 3x3 filter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "A48nt8YqHoze"
   },
   "outputs": [],
   "source": [
    "straights=[np.rot90(top,i) for i in range(4)]\n",
    "plots(straights)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "cv5eeO7gHozn"
   },
   "source": [
    "We proceed similarly to generate a set of filters with a different behavior"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "xhJFADUoHozr"
   },
   "outputs": [],
   "source": [
    "br=[[ 0, 0, 1],\n",
    "    [ 0, 1,-1.5],\n",
    "    [ 1,-1.5, 0]]\n",
    "\n",
    "diags = [np.rot90(br,i) for i in range(4)]\n",
    "plots(diags)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "zhGJF9gNHoz0"
   },
   "source": [
    "We can compose filters to obtain more complex patterns"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "k9ETAdFKHoz1"
   },
   "outputs": [],
   "source": [
    "rots = straights + diags\n",
    "corrs_cross = [correlate(cross, rot) for rot in rots]\n",
    "plots(corrs_cross)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "5GnSnG1HHoz5"
   },
   "outputs": [],
   "source": [
    "rots = straights + diags\n",
    "corrs = [correlate(images[5000], rot) for rot in rots]\n",
    "plots(corrs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "plV6mfsjHoz-"
   },
   "source": [
    "Next we illustrate the effect of downsampling.\n",
    "We select the most basic downsampling technique: __max pooling__. We keep only the maximum value for sliding windows of size ```7x7```.\n",
    "__Max pooling__ is a handy technique with a few useful perks:\n",
    "- since it selects the maximum values it ensures invariance to translations\n",
    "- reducing the size is helpful since data becomes more compact and easier to compare\n",
    "- we will see later in this course that since max pooling reduces the size of our images, the operations performed later on in the network have bigger receptive field / concern a bigger patch in the input image and allow the discovery of higher level patterns."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "QfR3u9FYHo0E"
   },
   "outputs": [],
   "source": [
    "import skimage\n",
    "\n",
    "from skimage.measure import block_reduce\n",
    "\n",
    "def pool(im): return block_reduce(im, (7,7), np.max)\n",
    "\n",
    "plots([pool(im) for im in corrs])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Ud1VbsiRHo1s"
   },
   "source": [
    "We now build a classifier with convolutions.\n",
    "\n",
    "To this end we select a set of training images depicting _eights_ and _ones_, we convolve them with our set of filters, pool them and average them for each class and filter. We will thus obtain a set of _representative_ signatures for _eights_ and for _ones_. \n",
    "Given a new test image we compute its features by convolution and pooling with the same filters and then compare them with the _representative_ features. The class with the most _similar_ features is chosen as prediction.\n",
    "\n",
    "\n",
    "We keep 1000 images of _eight_ for the test set and use the remaining ones for the training: we convolve them with our bank of filters, perform max pooling on the responses and store them in ```pool8```."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "ptgnsYPmHo1u"
   },
   "outputs": [],
   "source": [
    "pool8 = [np.array([pool(correlate(im, rot)) for im in eights[1000:]]) for rot in rots]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "UoxKz3LNHo10"
   },
   "outputs": [],
   "source": [
    "len(pool8), pool8[0].shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "hUSDBnm9Ho13"
   },
   "source": [
    "We plot the result of the first filter+pooling on the first 5 _eights_ in our set. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "zlqkFm1RHo14"
   },
   "outputs": [],
   "source": [
    "plots(pool8[0][0:5])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5i4oiuoeHo17"
   },
   "source": [
    "For the 4 first _eights_ in our set, we plot the result of the 8 filters+pooling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Av_HAmtJHo17"
   },
   "outputs": [],
   "source": [
    "plots([pool8[i][0] for i in range(8)])\n",
    "plots([pool8[i][1] for i in range(8)])\n",
    "plots([pool8[i][2] for i in range(8)])\n",
    "plots([pool8[i][3] for i in range(8)])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "zqDZeWY6Ho1-"
   },
   "source": [
    "We normalize the data in order to smoothen activations and bring them to similar ranges of values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "D5wmCZY9Ho2A"
   },
   "outputs": [],
   "source": [
    "def normalize(arr): return (arr-arr.mean())/arr.std()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5-bzBjC6Ho2D"
   },
   "source": [
    "Next we compute the average _eight_ by averaging all responses for each filter from _rots_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Ldr9Cpx5Ho2D"
   },
   "outputs": [],
   "source": [
    "filts8 = np.array([ims.mean(axis=0) for ims in pool8])\n",
    "filts8 = normalize(filts8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "oZt78eGSHo2H"
   },
   "source": [
    "We should obtain a set of canonical _eights_ responses for each filter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "_UdpsK6AHo2K"
   },
   "outputs": [],
   "source": [
    "plots(filts8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "4HaAjI_uHo2O"
   },
   "source": [
    "We proceed similarly with training samples from the _one_ class and plot the canonical _ones_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "e6baq8YKHo2P"
   },
   "outputs": [],
   "source": [
    "pool1 = [np.array([pool(correlate(im, rot)) for im in ones[1000:]]) for rot in rots]\n",
    "filts1 = np.array([ims.mean(axis=0) for ims in pool1])\n",
    "filts1 = normalize(filts1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "eq2CtFfAHo2T"
   },
   "outputs": [],
   "source": [
    "plots(filts1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Bu97H3bNHo2Z"
   },
   "source": [
    "Do you notice any differences between ```filts8``` and ```filts1```? Which ones?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "Krvaio26Ho2a"
   },
   "source": [
    "We define a function that correlates a given image with all filters from ```rots``` and max pools the responses."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "kgnZzimPHo2a"
   },
   "outputs": [],
   "source": [
    "def pool_corr(im): return np.array([pool(correlate(im, rot)) for rot in rots])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "2vFjsSxcHo2c"
   },
   "outputs": [],
   "source": [
    "plots(pool_corr(eights[1000]))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Td0bFQ04Ho2j"
   },
   "outputs": [],
   "source": [
    "#check \n",
    "plots([pool8[i][0] for i in range(8)])\n",
    "np.allclose(pool_corr(eights[1000]),[pool8[i][0] for i in range(8)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "x2cJdyL2Ho2u"
   },
   "outputs": [],
   "source": [
    "# function used for a voting based classifier that will indicate which one of the \n",
    "# two classes is most likely given the sse distances\n",
    "# n2 comes from norm2\n",
    "# is8_n2 returns 1 if it thinks it's an eight and 0 otherwise\n",
    "def is8_n2(im): return 1 if sse(pool_corr(im),filts1) > sse(pool_corr(im),filts8) else 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "SDFMHV3nHo2y"
   },
   "source": [
    "We perform a check to see if our function actually works. We correlate the an image of _eight_ with ```filts8``` and ```filts1```. It should give smaller distance for the _eights_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "C_eFq1EWHo2z"
   },
   "outputs": [],
   "source": [
    "sse(pool_corr(eights[0]), filts8), sse(pool_corr(eights[0]), filts1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "yNINKzvKHo24"
   },
   "outputs": [],
   "source": [
    "plot(eights[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "roCmorVtHo29"
   },
   "source": [
    "We now test our classifier on the 1000 images of _eights_ and 1000 images of _ones_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "TEPN_aPbHo3K"
   },
   "outputs": [],
   "source": [
    "nb_8_predicted_8, nb_1_predicted_8 = [np.array([is8_n2(im) for im in ims]).sum() for ims in [eights[:1000],ones[:1000]]]\n",
    "\n",
    "nb_8_predicted_1, nb_1_predicted_1 = [np.array([(1-is8_n2(im)) for im in ims]).sum() for ims in [eights[:1000],ones[:1000]]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "FO4sLIjSHo3O"
   },
   "outputs": [],
   "source": [
    "Precisionf_8, Recallf_8, Precisionf_1, Recallf_1 = compute_scores(nb_8_predicted_8,nb_8_predicted_1,nb_1_predicted_1,nb_1_predicted_8)\n",
    "\n",
    "print('precision 8:', Precisionf_8, 'recall 8:', Recallf_8)\n",
    "print('precision 1:', Precisionf_1, 'recall 1:', Recallf_1)\n",
    "print('accuracy :', (Recallf_1+Recallf_8)/2)\n",
    "print('accuracy baseline:', (Recall_1+Recall_8)/2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "yoy--n6IHo3b"
   },
   "source": [
    "We improved the accuracy while reducing the embedding size from a $28\\times 28 = 784$ vector to a $4\\times 4\\times 8 = 128$ vector.\n",
    "\n",
    "We have successfully built a classifier for _eights_ and _ones_ using features extracted with a bank of pre-defined features and a set of training samples. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "collapsed": true,
    "id": "Y_mBZHv9Ho3d"
   },
   "source": [
    "## 5. Practicals: improving classification with Convolutional Neural Net\n",
    "\n",
    "You will now build a neural net that will learn the weights of the filters.\n",
    "\n",
    "The first layer of your network will be a convolutional layer with $8$ filters of size $3\\times 3$. Then you will apply a Max Pooling layer to reduce the size of the image to $4\\times 4$ as we did above. This will produce a (once flatten) a vector of size $128 = 4\\times 4\\times 8$. From this vector, you need to predict if the corresponding input is a $1$ or a $8$. So you are back to a classification problem as seen in previous lesson.\n",
    "\n",
    "You need to fill the code written below to construct your CNN. You will need to look for documentation about [torch.nn](https://pytorch.org/docs/stable/nn.html) in the Pytorch doc."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "F4ADp7YLHo3d"
   },
   "outputs": [],
   "source": [
    "import torch.nn as nn\n",
    "import torch.nn.functional as F\n",
    "\n",
    "class classifier(nn.Module):\n",
    "    \n",
    "    def __init__(self):\n",
    "        super(classifier, self).__init__()\n",
    "        # fill the missing entries below\n",
    "        self.conv1 = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=3, padding=?)\n",
    "        self.fc = nn.Linear(in_features=128, out_features=2)\n",
    "        \n",
    "    def forward(self,x):\n",
    "        # implement your network here, use F.max_pool2d, F.log_softmax and do not forget to flatten your vector\n",
    "        x = self.conv1(x)\n",
    "        #\n",
    "        # Your code here\n",
    "        #\n",
    "        #\n",
    "        return x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "5YinNMu0Ho3g"
   },
   "outputs": [],
   "source": [
    "conv_class = classifier()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "EapjD0UQHo3k"
   },
   "source": [
    "Your code should work fine on a batch of 3 images."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "EtN49qaUHo3p"
   },
   "outputs": [],
   "source": [
    "batch_3images = train_set.data[0:2].type(torch.FloatTensor).resize_(3, 1, 28, 28)\n",
    "#conv_class(batch_3images)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "5s841xShHo3u"
   },
   "source": [
    "The following lines of code implement a data loader for the train set and the test set. No modification is needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "iO_mo1CaHo3u"
   },
   "outputs": [],
   "source": [
    "bs = 64\n",
    "\n",
    "l8 = np.array(0)\n",
    "eights_dataset = [[torch.from_numpy(e.astype(np.float32)).unsqueeze(0), torch.from_numpy(l8.astype(np.int64))] for e in eights]\n",
    "l1 = np.array(1)\n",
    "ones_dataset = [[torch.from_numpy(e.astype(np.float32)).unsqueeze(0), torch.from_numpy(l1.astype(np.int64))] for e in ones]\n",
    "train_dataset = eights_dataset[1000:] + ones_dataset[1000:]\n",
    "test_dataset = eights_dataset[:1000] + ones_dataset[:1000]\n",
    "\n",
    "train_loader = torch.utils.data.DataLoader(train_dataset,\n",
    "    batch_size=bs, shuffle=True)\n",
    "test_loader = torch.utils.data.DataLoader(test_dataset,\n",
    "    batch_size=bs, shuffle=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "LVzpT2LvHo3x"
   },
   "source": [
    "You need now to code the training loop. Store the loss and accuracy for each epoch."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "jsu7KRMIHo3y"
   },
   "outputs": [],
   "source": [
    "def train(model,data_loader,loss_fn,optimizer,n_epochs=1):\n",
    "    model.train(True)\n",
    "    loss_train = np.zeros(n_epochs)\n",
    "    acc_train = np.zeros(n_epochs)\n",
    "    for epoch_num in range(n_epochs):\n",
    "        running_corrects = 0.0\n",
    "        running_loss = 0.0\n",
    "        size = 0\n",
    "\n",
    "        for data in data_loader:\n",
    "            inputs, labels = data\n",
    "            bs = labels.size(0)\n",
    "            #\n",
    "            #\n",
    "            # Your code here\n",
    "            #\n",
    "            #\n",
    "            size += bs\n",
    "        epoch_loss = running_loss.item() / size\n",
    "        epoch_acc = running_corrects.item() / size\n",
    "        loss_train[epoch_num] = epoch_loss\n",
    "        acc_train[epoch_num] = epoch_acc\n",
    "        print('Train - Loss: {:.4f} Acc: {:.4f}'.format(epoch_loss, epoch_acc))\n",
    "    return loss_train, acc_train"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "5Es_nmD2Ho34"
   },
   "outputs": [],
   "source": [
    "conv_class = classifier()\n",
    "# choose the appropriate loss\n",
    "loss_fn = \n",
    "# your SGD optimizer\n",
    "learning_rate = 1e-3\n",
    "optimizer_cl = \n",
    "# and train for 10 epochs\n",
    "l_t, a_t = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "JPG9BbRKHo39"
   },
   "source": [
    "Let's learn for 10 more epochs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "80ABV57nHo3-"
   },
   "outputs": [],
   "source": [
    "l_t1, a_t1 = train(conv_class,train_loader,loss_fn,optimizer_cl,n_epochs = 10)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "P960YqN0Ho4J"
   },
   "source": [
    "Our network seems to learn but we now need to check its accuracy on the test set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "f_ttkHKAHo4K"
   },
   "outputs": [],
   "source": [
    "def test(model,data_loader):\n",
    "    model.train(False)\n",
    "\n",
    "    running_corrects = 0.0\n",
    "    running_loss = 0.0\n",
    "    size = 0\n",
    "\n",
    "    for data in data_loader:\n",
    "        inputs, labels = data\n",
    "            \n",
    "        bs = labels.size(0)\n",
    "        #\n",
    "        # Your code here\n",
    "        #\n",
    "        size += bs\n",
    "\n",
    "    print('Test - Loss: {:.4f} Acc: {:.4f}'.format(running_loss / size, running_corrects.item() / size))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "nNB9RU20Ho4M"
   },
   "outputs": [],
   "source": [
    "test(conv_class,test_loader)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "1ZUVmjBOHo4O"
   },
   "source": [
    "Change the optimizer to Adam."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "eqLRDPTxHo4P"
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "VnXvEogNHo4S"
   },
   "source": [
    "How many parameters did your network learn?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "Qd5JSwZDHo4S"
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "lfxpzNo8Ho4U"
   },
   "source": [
    "You can see them as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "XkpdG05OHo4U"
   },
   "outputs": [],
   "source": [
    "for m in conv_class.children():\n",
    "    print('weights :', m.weight.data)\n",
    "    print('bias :', m.bias.data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "rjXp3y5HHo4W"
   },
   "outputs": [],
   "source": [
    "for m in conv_class.children():\n",
    "    T_w = m.weight.data.numpy()\n",
    "    T_b = m.bias.data.numpy()\n",
    "    break"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "S-g2xTm-Ho4X"
   },
   "outputs": [],
   "source": [
    "plots([T_w[i][0] for i in range(8)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "PlvjNVKhHo4Z"
   },
   "outputs": [],
   "source": [
    "T_b"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "id": "XORiuyWzHo4a"
   },
   "source": [
    "[![Dataflowr](https://raw.githubusercontent.com/dataflowr/website/master/_assets/dataflowr_logo.png)](https://dataflowr.github.io/website/)"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "include_colab_link": true,
   "name": "06_convolution_digit_recognizer.ipynb",
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
